Speaker Independent Spee Using Features Based on Glo
Authors
Abstract
We discuss the use of features based on the glottal sound source for speaker-independent speech recognition. Features such as pitch have been thought unable to contribute to speaker-independent speech recognition because they are dominated by speaker-dependent factors. In this paper, we used pitch (F0), power, LPC residual power, voicing rate, and their regression coefficients as feature parameters for speaker-independent speech recognition. We found that the regression parameters of F0, power, and LPC residual power improved performance, especially when the covariances between each parameter and the conventional MFCC were used. This indicates that deriving the regression parameters reduces the speaker-dependent factor, which appears as a bias in those features, and that the correlation between glottal source information and spectral envelope information (MFCC) is useful for recognition. We also tested the parameters on a large-vocabulary continuous speech recognition task and obtained a performance improvement.
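The claim that regression parameters remove a per-speaker bias can be illustrated with a small sketch. The code below uses the common sliding-window least-squares slope (the usual "delta" feature formula); the abstract does not state the window length or the exact formula the authors used, so `regression_coefficients`, `half_window`, and the F0 values are all hypothetical and chosen only for illustration.

```python
def regression_coefficients(track, half_window=2):
    """Least-squares slope of `track` over a window of 2*half_window+1 frames.

    Assumption: this is the standard delta-feature formula, not necessarily
    the exact procedure used in the paper.
    """
    if half_window < 1:
        raise ValueError("half_window must be at least 1")
    n = len(track)
    # Denominator of the slope formula: sum of k^2 for k = -K..K
    denom = 2 * sum(k * k for k in range(1, half_window + 1))
    deltas = []
    for t in range(n):
        num = 0.0
        for k in range(1, half_window + 1):
            ahead = track[min(t + k, n - 1)]  # replicate the last frame at the end
            behind = track[max(t - k, 0)]     # replicate the first frame at the start
            num += k * (ahead - behind)
        deltas.append(num / denom)
    return deltas


# Hypothetical F0 contour in Hz for one speaker (made-up values)
f0_a = [120.0, 125.0, 132.0, 140.0, 138.0, 130.0]
# Same contour shifted by a constant offset, standing in for a
# second speaker with a higher average pitch
f0_b = [v + 80.0 for v in f0_a]

delta_a = regression_coefficients(f0_a)
delta_b = regression_coefficients(f0_b)

# The constant offset cancels in every (ahead - behind) difference,
# so both speakers yield the same regression coefficients.
print(all(abs(a - b) < 1e-9 for a, b in zip(delta_a, delta_b)))
```

Because each coefficient depends only on differences between frames, any additive offset in the raw feature drops out, which matches the abstract's description of the speaker-dependent factor appearing as a bias.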
Related resources
Speaker Independent Speech Recognition Using Hidden Markov Models for Persian Isolated Words
Phonetic Speaker Id
This paper describes the exploration of text-independent speaker identification using novel approaches based on speakers’ phonetic features instead of traditional acoustic features. Different phonetic speaker identification approaches are discussed in this paper and evaluated using two speaker identification systems: one multilingual system and one single language multiple-engine system. Furthe...
Improved Gender Independent Speaker Recognition Using Convolutional Neural Network Based Bottleneck Features
This paper proposes a novel framework to improve performance of gender independent i-Vector PLDA based speaker recognition using convolutional neural network (CNN). Convolutional layers of a CNN offer robustness to variations in input features including those due to gender. A CNN is trained for ASR with a linear bottleneck layer. Bottleneck features extracted using the CNN are then used to trai...
Dissertation Summary Recognizing Non-native Speech: Characterizing and Adapting to Non-native Usage in LVCSR
Low-proficiency non-native speakers represent a significant challenge for large-vocabulary continuous speech recognition (LVCSR). Acoustic models are confused by a heavy accent; language models are confused by poor grammar and unconventional word choice. Lack of comfort with the spoken language affects the fundamental properties of connected speech that have been a focus of LVCSR research; cross-word and inte...